What is claimed is: 

1. A dedicated, hardware-based Physics Processing Unit (PPU), comprising: 

a vector processor adapted to perform multiple, parallel floating point 
operations to generate physics data; and 

a data communication circuit adapted to communicate the physics data to a 
host. 

2. The PPU of claim 1, wherein the host comprises a Central Processing Unit 
(CPU), and the PPU further comprises: 

a PPU Control Engine (PCE) receiving commands from the CPU and 
controlling communication of the physics data from the PPU to the host. 

3. The PPU of claim 2, wherein the PPU further comprises: 
an external memory and an internal memory; and 

a Data Movement Engine (DME) controlling the movement of data between 
the external memory and the internal memory in response to instructions received 
from the PCE. 

4. The PPU of claim 3, further comprising: 

a Floating Point Engine (FPE) performing multiple, parallel floating point 
operations on data stored in the internal memory. 

5. The PPU of claim 4, wherein the internal memory is operatively connected 
to the DME, and further comprising: 

a high-speed memory bus operatively connecting an external high-speed 
memory to at least one of the DME and the FPE. 

6. The PPU of claim 5, wherein the internal memory comprises multiple 
banks allowing multiple data threading operations. 

7. The PPU of claim 3, wherein the PCE comprises control and 
communication software stored in a RISC core. 

8. The PPU of claim 5, wherein the internal memory comprises first and 
second banks, and wherein the DME further comprises: 

a first unidirectional crossbar connected to the first bank; 
a second unidirectional crossbar connected to the second bank; and, 
a bi-directional crossbar connecting the first and second unidirectional crossbars to the external 
high-speed memory. 

9. A dedicated, hardware-based Physics Processing Unit (PPU) connected 
within a system to a Central Processing Unit (CPU) and comprising: 

an external memory storing data; and, 

an Application Specific Integrated Circuit (ASIC) implementing a vector 
processor adapted to perform multiple, floating point operations. 

10. The PPU of claim 9, wherein the system comprises a Personal Computer 
(PC); and wherein the PPU comprises an expansion board adapted for incorporation 
within the PC, the expansion board mounting the ASIC and the external memory. 

11. The PPU of claim 10, further comprising circuitry enabling at least one 
data communications protocol between the PPU and CPU. 

12. The PPU of claim 11, wherein the at least one data communications 
protocol comprises at least one protocol selected from a group of protocols defined by 
USB, USB2, Firewire, PCI, PCI-X, PCI-Express, and Ethernet. 

13. The PPU of claim 11, wherein the ASIC comprises a PPU Control Engine 
(PCE) receiving commands from the CPU and controlling data communications 
between the PPU and PC. 

14. The PPU of claim 13, wherein the ASIC further comprises: 
an internal memory; and 

a Data Movement Engine (DME) controlling the movement of data between 
the external memory and the internal memory in response to instructions received 
from the PCE. 

15. The PPU of claim 14, further comprising: 

a Floating Point Engine (FPE) performing multiple, parallel floating point 
operations on data stored in the internal memory. 

16. The PPU of claim 15, wherein the internal memory is operatively 
connected to the DME, and further comprising: 

a high-speed memory bus operatively connecting the external memory to at 
least one of the DME and the FPE. 

17. The PPU of claim 16, wherein the internal memory comprises multiple 
banks allowing multiple data threading operations. 

18. The PPU of claim 17, wherein the internal memory further comprises: 
an Inter-Engine memory transferring data between the DME and FPE. 

19. The PPU of claim 18, wherein the internal memory further comprises: 
a Scratch Pad memory. 

20. The PPU of claim 14, further comprising a command packet queue 
transferring command packets from the PCE to the DME. 

21. The PPU of claim 15, wherein the FPE comprises a plurality of Vector 
Floating-point Units. 

22. The PPU of claim 21, wherein at least one of the command packets defines 
a vector length of variable length. 

23. The PPU of claim 15, wherein the DME comprises a plurality of Memory 
Control Units (MCUs) and a Switch Fabric connecting the MCUs to the external 
memory; and, 

wherein the FPE comprises a plurality of Vector Processing Engines (VPEs) 
receiving data from at least one of the MCUs via a VPE bus. 

24. The PPU of claim 23, wherein each Vector Processing Engine (VPE) 
comprises a plurality of Vector Processing Units (VPUs) receiving data from the VPE 
bus. 

25. The PPU of claim 24, wherein each VPU comprises: 

a dual bank Inter-Engine Memory (IEM) receiving data from the VPE bus; 
one or more data registers receiving data from the IEM under the control of an 
associated Load/Store Unit; and 

an Execution Unit performing parallel floating point operations. 

26. The PPU of claim 23, wherein at least one command packet received from 
the PCE defines a vector length of variable length. 

27. The PPU of claim 23, wherein the Switch Fabric comprises at least one 
crossbar circuit. 

28. The PPU of claim 24, wherein each VPU is dynamically re-configurable. 